Introduction

Achieving cholera elimination requires adequate, representative data to inform intervention policies. In 2014, IEDCR and icddr,b established a national cholera surveillance system in Bangladesh. However, we do not know whether high-risk cholera areas are captured by this system.

In 1988, the US Centers for Disease Control and Prevention (CDC) published Guidelines for Evaluating Public Health Surveillance Systems (updated in 2001), which aimed to standardize evaluations of public health surveillance systems using a series of broad characteristics, including representativeness and sensitivity. Representativeness is defined as the accurate description of cases over time and their distribution in a population by place and person, and sensitivity is defined as the proportion of true cases or outbreaks detected by the surveillance system. Yet measuring both indicators and validating the data collected by the surveillance system requires external data that establish the true incidence of disease in the population. Typically, such data include medical records and registries, which rarely exist or are incomplete in low-resource settings.

Given recent estimates of V. cholerae seroincidence from a nationally-representative cross-sectional serosurvey conducted in 2015, we sought to describe the representativeness and sensitivity of the cholera surveillance system using geographically-resolved infection data. We identify how well the Bangladesh national cholera surveillance system captures

  1. the Bangladeshi population
  2. the Bangladeshi population living in high-, moderate-, and low-risk cholera areas

to determine which surveillance sites may be most efficiently used to deliver new interventions.

Hospitals in the national cholera surveillance system and the icddr,b Dhaka hospital are the only healthcare facilities that regularly perform laboratory confirmation of V. cholerae in Bangladesh. Consequently, it is important to understand how well areas with high cholera risk are captured by the surveillance system.

Methods

Identifying the cholera surveillance zone

There are 23 hospital sites that perform laboratory confirmation of V. cholerae in Bangladesh (Table 1).

Table 1: Sentinel hospital IDs and locations.

ID | Hospital | Division | Type
1 | District Hospital Norshingdi | Dhaka | district
2 | Adhunik Sadar Hospital Habiganj | Sylhet | district
3 | District Sadar Hospital Cox’s Bazar | Chittagong | district
4 | Adhunik Sadar Hospital Naogaon | Rajshahi | district
5 | General Hospital Patuakhali | Barisal | tertiary
6 | Adhunik Sadar Hospital Thakurgaon | Rangpur | district
7 | District Sadar Hospital Satkhira | Khulna | district
8 | Dhaka Medical College Dhaka | Dhaka | tertiary
9 | Uttara Adhunik Medical College Hospital | Dhaka | tertiary
10 | Bangladesh Institute of Tropical and Infectious Diseases Chittagong | Chittagong | tertiary
11 | General Hospital Tangail | Dhaka | district
12 | General Hospital Narayanganj | Dhaka | district
13 | Sadar Hospital Chuadanga | Khulna | district
14 | General Hospital Meherpur | Khulna | district
15 | General Hospital Comilla | Chittagong | district
16 | Upazila Health Complex Chaugachha Jessore | Khulna | subdistrict
17 | General Hospital Kushtia | Khulna | district
18 | Upazila Health Complex Madan | Mymensingh | subdistrict
19 | Upazila Health Complex Chhatak Sunamganj | Sylhet | subdistrict
20 | Upazila Health Complex Mathbariya | Barisal | subdistrict
21 | Upazila Health Complex Bakerganj | Barisal | subdistrict
22 | Health Complex Shibganj | Rajshahi | subdistrict
23 | icddr,b Cholera Hospital | Dhaka | icddrb

In the absence of better data on health care utilization at the sentinel surveillance hospitals, we assumed that the catchment areas of subdistrict, district, tertiary care, and icddr,b Dhaka hospitals could be defined by radii of 10, 20, 30, and 30 km, respectively, around each hospital (Figure 1). We refer to the union of the buffers around all 23 hospitals as the “cholera surveillance zone” in Bangladesh.
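The buffer rule above can be sketched in a few lines. This is a minimal illustration, not the analysis code: it assumes hospital coordinates are available as decimal degrees (the coordinates in the usage note are hypothetical) and approximates each buffer with a great-circle distance check rather than a projected GIS buffer.

```python
import math

# Catchment radius (km) by hospital type, as assumed in the text.
RADIUS_KM = {"subdistrict": 10, "district": 20, "tertiary": 30, "icddrb": 30}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def in_surveillance_zone(lat, lon, hospitals):
    """True if the point lies inside the buffer of at least one hospital.

    `hospitals` is a list of (lat, lon, type) tuples (hypothetical input format).
    """
    return any(
        haversine_km(lat, lon, h_lat, h_lon) <= RADIUS_KM[h_type]
        for h_lat, h_lon, h_type in hospitals
    )
```

For example, with two made-up sites `[(23.70, 90.40, "tertiary"), (24.00, 91.00, "subdistrict")]`, a grid-cell centroid at (23.70, 90.60) falls inside the 30 km tertiary buffer, while one at (24.00, 91.15) lies outside both buffers.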

Figure 1: Map of 10-20-30-30km buffers around subdistrict-district-tertiary-icddr,b sentinel hospital sites, respectively.

Identifying greyspots in the distribution of V. cholerae

We use modeled estimates of V. cholerae seroincidence, which express the risk of infection within the previous year relative to the population-weighted mean, across a 5 km x 5 km grid of Bangladesh. These estimates were based on a nationally representative serosurvey conducted in 70 communities in Bangladesh in 2015. We measure uncertainty with entropy, which quantifies how uncertain we are that a seroincidence hotspot (RR > 2) is a true hotspot and that a coldspot (RR < 2) is truly not a hotspot (Shannon entropy adapted by https://www.medrxiv.org/content/10.1101/2020.01.10.20016964v1.full.pdf). Our sensitivity analysis shows… [Reason why we chose RR of 2 as the threshold for entropy]

Higher entropy corresponds to greater uncertainty about whether a grid cell is truly a hotspot or a coldspot, while lower entropy corresponds to greater certainty. Areas that both have large uncertainty in our seroincidence estimates and lie outside the cholera surveillance zone are “greyspots,” locations about which no current cholera-related information is known.
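To make the entropy idea concrete, the sketch below computes the standard binary Shannon entropy of the hotspot classification for one grid cell and then applies the greyspot rule. It is an assumption-laden illustration: the adapted entropy used in the cited preprint may differ (for example in log base or scaling), the posterior draws are a hypothetical input, and the default cutoff of 0.3 is just one of the values examined in the Results.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a yes/no outcome with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # outcome is certain, so there is no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def hotspot_entropy(rr_draws, threshold=2.0):
    """Entropy of the hotspot call (RR > threshold) for one grid cell.

    `rr_draws` is a list of posterior relative risk draws (hypothetical input).
    """
    p_hot = sum(rr > threshold for rr in rr_draws) / len(rr_draws)
    return binary_entropy(p_hot)


def is_greyspot(entropy, in_zone, cutoff=0.3):
    """Greyspot: uncertain seroincidence estimate AND outside the surveillance zone."""
    return entropy > cutoff and not in_zone
```

Entropy is 0 when every draw agrees on the classification and reaches its maximum of 1 bit when the draws are split evenly between hotspot and coldspot.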

[Edit entropy figure to highlight sero sites that are high in seroincidence and high in entropy]

Defining risk of infection with V. cholerae

Although we have great uncertainty about the relative seroincidence risk across Bangladesh, our previous modeling efforts represent the best information we have about cholera risk in a given location. We used these modeling outputs to characterize the disease risk of populations living in V. cholerae greyspots.

Using the modeled seroincidence risk of infection with V. cholerae, we calculate three quantitative measures at the 5 km x 5 km grid cell level to represent risk of infection:

  1. Relative risk: Median seroincidence risk at the grid cell level relative to the population-weighted mean seroincidence risk across Bangladesh. Relative risk of 1 indicates that the grid cell seroincidence risk is the same as the mean risk across the country.
  2. Proportion infected: Median estimated proportion of the grid cell population that was infected with V. cholerae in the last year.
  3. Absolute infections: Median estimated number of the grid cell population infected with V. cholerae in the last year.

For each of these measures, we identified thresholds by which we could partition grid cells into “high,” “moderate,” and “low” risk.

Relative risk

We examined the distribution of relative risk and the width of the relative risk confidence interval estimates to determine the range for identifying appropriate cutoffs.

## [1] "***** What is the distribution of relative risk estimates? *****"

## [1] "summary stats on median relative risk: deciles"
##        0%       10%       20%       30%       40%       50%       60%       70% 
## 0.1878425 0.5435891 0.6453819 0.7188687 0.7860784 0.8562672 0.9297886 1.0088887 
##       80%       90%      100% 
## 1.1000150 1.3103767 3.2647244
## [1] "What is the range of the relative risk estimates, where range is the width of the 95% CI?"

## [1] "summary stats on 1/2 range of RR: deciles"
##        0%       10%       20%       30%       40%       50%       60%       70% 
## 0.4641809 1.0576451 1.2435192 1.3503377 1.4407787 1.5218541 1.6012463 1.6837145 
##       80%       90%      100% 
## 1.7904077 1.9615904 2.7042683
## [1] "Decision: Confidence interval ranges are too large to be useful. Arbitrarily choose relative risks of 0.8 and 1.2 as cutoffs"

We found that the confidence intervals were too wide to be useful for this purpose, reflecting very high uncertainty in the relative seroincidence risk. Instead, we arbitrarily chose relative risks of 0.8 and 1.2 as the cutoffs separating low from moderate risk and moderate from high risk, respectively.
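The fixed cutoffs translate into a simple binning rule, sketched here. The text does not say which category a value exactly equal to a cutoff belongs to; this sketch assigns it to the higher category, which is an assumption.

```python
RR_LOW_CUT = 0.8   # below this: low risk
RR_HIGH_CUT = 1.2  # at or above this: high risk (boundary handling is assumed)


def rr_category(rr):
    """Bin a median relative risk into low / moderate / high."""
    if rr < RR_LOW_CUT:
        return "low"
    if rr < RR_HIGH_CUT:
        return "moderate"
    return "high"
```

Applied to the deciles reported above, the 10th percentile (0.54) is low, the median (0.86) is moderate, and the 90th percentile (1.31) is high.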

Proportion infected

We examined the distribution of the median proportion of the population infected and chose the 30th and 70th percentiles as cutoffs for moderate and high risk.

## [1] "***** What is the distribution of median proportion (of the population) infected within the last year? *****"

## [1] "summary stats on median proportion infected: deciles"
##         0%        10%        20%        30%        40%        50%        60% 
## 0.03595797 0.10215108 0.12106694 0.13596771 0.14978710 0.16324129 0.17732849 
##        70%        80%        90%       100% 
## 0.19204748 0.21129631 0.25296893 0.63651462
## [1] "Decision: Use 30th and 70th percentiles as cutoffs"
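A percentile-based version of the binning might look like the following sketch. It assumes the cutoffs are computed across all grid cells with linear interpolation between order statistics, and that a value equal to a cutoff goes to the higher category; neither detail is stated in the text.

```python
import bisect
import statistics

LABELS = ["low", "moderate", "high"]


def percentile_cutoffs(values):
    """Return the 30th and 70th percentiles of `values`.

    method="inclusive" interpolates linearly between order statistics; this is
    assumed to match the quantile rule behind the deciles shown above.
    """
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return deciles[2], deciles[6]


def bin_by_percentile(values):
    """Label each value low / moderate / high using the 30th/70th percentiles."""
    cuts = percentile_cutoffs(values)
    # bisect_right places a value equal to a cutoff in the higher category.
    return [LABELS[bisect.bisect_right(cuts, v)] for v in values]
```

Because the cutoffs are percentiles of the grid cells, roughly 30%, 40%, and 30% of cells fall into the low, moderate, and high categories; the populations in those categories need not follow the same split.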

Absolute infections

We examined the distribution of the median number of absolute infections on the log10 scale and chose the 30th and 70th percentiles as cutoffs for moderate and high risk.

## [1] "***** What is the distribution of median infections within the last year? *****"

## [1] "summary stats on median infections: deciles"
##            0%           10%           20%           30%           40% 
##      3.995863    251.563380    926.230252   2040.131620   2700.922748 
##           50%           60%           70%           80%           90% 
##   3291.225551   3954.858512   4758.848247   5804.953773   7447.174971 
##          100% 
## 252705.328285
## [1] "Decision: Use 30th and 70th percentiles as cutoffs"

Results

Population in the cholera surveillance zone

We overlaid the cholera surveillance zone on population data for Bangladesh (original source: 2015 100 m WorldPop estimates) to estimate the number of people living within the cholera surveillance zone (Table 2, Figure 3).

Table 2: Population living in the cholera surveillance zone.

Buffer (Subdistrict-District-Tertiary-icddr,b in km) | Pop | Proportion
10-20-30-30 | 50887409 | 0.312939

Identifying greyspots outside the cholera surveillance zone

Areas that both have large uncertainty in our seroincidence estimates and lie outside the cholera surveillance zone are “greyspots,” locations about which no cholera-related information is known (Figure 4). Here we define high uncertainty as an entropy greater than 0.2, 0.3, or 0.4 to illustrate how the distribution of greyspots changes as the uncertainty threshold changes.

Figure 4: Cholera greyspots. The 5 km x 5 km grid cells that are colored are cells where we have reasonably confident seroincidence estimates (A. entropy < 0.2, B. entropy < 0.3, C. entropy < 0.4) or that are in the cholera surveillance zone. Grid cells in grey are greyspots, locations where we have almost no cholera information.

1. Relative seroincidence risk

Using data from a nationally representative serosurvey across Bangladesh, we developed maps of infection across 5 km x 5 km grid cells in Bangladesh. Based on the 2015 serosurvey, these maps estimate that there were 25,651,745 infections in Bangladesh in the year before data collection, out of an estimated population of 162,611,264. We standardized the seroincidence estimate in each cell by the population-weighted mean infection risk across Bangladesh. This yielded a relative risk of infection for each grid cell (Figure 5).
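The standardization step can be illustrated with a short sketch. It assumes each grid cell has an estimated proportion infected and a population count; the function name and the toy inputs in the usage note are hypothetical.

```python
def relative_risk(proportions, populations):
    """Divide each cell's infection risk by the population-weighted mean risk.

    proportions: estimated proportion infected in each grid cell
    populations: population of each grid cell (same order)
    Returns (list of relative risks, population-weighted mean risk).
    """
    total_pop = sum(populations)
    weighted_mean = sum(p * n for p, n in zip(proportions, populations)) / total_pop
    return [p / weighted_mean for p in proportions], weighted_mean


# National mean risk implied by the totals in the text (rounded to whole numbers).
national_mean_risk = 25_651_745 / 162_611_264
```

For two toy cells with risks 0.10 and 0.30 and populations 300 and 100, the weighted mean is 0.15, so the second cell has a relative risk of 2. The national totals quoted above imply a mean risk of about 0.158.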

Figure 5: Median risk of V. cholerae seroincidence relative to a population-weighted mean by 5 km x 5 km grid cell. These relative risk estimates are bounded such that RRs above 2 and below -2 were plotted as the values 2 and -2, respectively. The black marks indicate sentinel hospital locations.

We describe the population living in each risk category (Table 3).

Table 3: Population living in each risk category. The population living in high, moderate, and low risk areas according to relative seroincidence risk across Bangladesh.

Risk level | Population
High | 15198981
Moderate | 63939131
Low | 83473152

We redrew the relative seroincidence risk map after binning grid cells into high, moderate, and low risk categories (Figure 6).

Figure 6: Cholera risk map as categorized by the risk of seroincidence relative to a population-weighted mean by 5 km x 5 km grid cell.

Relative risk in the cholera surveillance zone

We overlaid the cholera surveillance zone with the binned relative seroincidence risk maps to examine the distribution of risks in the surveilled areas (Figure 7 - Figure 8).

Figure 7: Relative seroincidence risk within cholera surveillance zones (10-20-30-30 km for subdistrict, district, tertiary care, and icddr,b hospitals).

Figure 8: Relative seroincidence risk outside of cholera surveillance zones (10-20-30-30 km for subdistrict, district, tertiary care, and icddr,b hospitals).

Populations in the cholera surveillance zone

We examined the estimated number of infections, the percent infected in the cholera surveillance zone, and the percent of Bangladesh infections captured in the cholera surveillance zone (Table 4).

Table 4: Number and percentage of infections that may be captured in cholera surveillance zones. The percentage infected in the surveillance zone population is the number infected divided by the surveillance zone population. The percentage of all Bangladesh (BGD) infections is the number of infected individuals within the cholera surveillance zone divided by all infected individuals in Bangladesh.

Buffer size | Surv. zone pop | Number infected | % infected in surv. zone pop | % of all BGD infections
10-20-30-30 | 50887409 | 7047294 | 13.85 | 27.47
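The two percentages in Table 4 use different denominators, which the following arithmetic check makes explicit. The inputs are the totals reported in this document, with the national figures rounded to whole numbers; the variable names are illustrative.

```python
zone_pop = 50_887_409             # population inside the surveillance zone
zone_infections = 7_047_294       # estimated infections inside the zone
national_pop = 162_611_264        # estimated national population (rounded)
national_infections = 25_651_745  # estimated national infections (rounded)

# Share of the zone's own population that was infected.
pct_infected_in_zone = 100 * zone_infections / zone_pop

# Share of all national infections that occurred inside the zone.
pct_of_all_infections = 100 * zone_infections / national_infections

# Share of the national population living inside the zone (Table 2).
prop_pop_in_zone = zone_pop / national_pop
```

Rounded to two decimals, these give 13.85 and 27.47, matching Table 4, and a population share of 0.312939, matching Table 2.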

We then examined the distribution of relative-risk-based categories (Table 5).

Table 5: Number and percentage of infections that may be captured in cholera surveillance zones, categorized by relative risk. The percentage of surveillance zone infections is the share of infected people in high/moderate/low risk grid cells among all infections within the cholera surveillance zone. The percentage of the surveillance zone population is the share of people living in high/moderate/low risk grid cells among all people within the cholera surveillance zone. The distribution across risk categories should sum to 100% for each set of buffer sizes.

Buffer size | Risk category | Number infected | % surv. zone infections | Surv. zone pop | % surv. zone pop
10-20-30-30 | High | 1532923 | 21.75 | 5657869 | 11.12
10-20-30-30 | Moderate | 1923614 | 27.30 | 10439618 | 20.52
10-20-30-30 | Low | 3590756 | 50.95 | 34789922 | 68.37

Populations by relative risk across Bangladesh

We sought to describe how well the cholera surveillance zone captures high-, moderate-, and low-risk populations across Bangladesh.

We summarized the percentage of high, moderate, and low infection risk populations in Bangladesh that would be captured by cholera surveillance zones at different buffer sizes when risk was categorized by relative risk (Table 6).

Table 6: Number and percent infections that may be captured in Bangladesh, categorized by relative risk. The captured at-risk population represents the percentage of high/moderate/low risk populations captured by the cholera surveillance zone out of all high/moderate/low risk populations in Bangladesh. The captured infections represents the percentage of infections in high/moderate/low risk grid cells among all infections in high/moderate/low risk grid cells across Bangladesh.

Buffer size | Risk category | Surv. zone pop | Surv. zone infections | Captured at-risk pop (%) | Captured infections (%)
10-20-30-30 | High | 5657869 | 1532923 | 37.23 | 33.52
10-20-30-30 | Moderate | 10439618 | 1923614 | 16.33 | 16.63
10-20-30-30 | Low | 34789922 | 3590756 | 41.68 | 37.74
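The difference between the within-zone distribution (Table 5) and the captured share of each risk category (Table 6) comes down to the denominator. The sketch below recomputes both population columns from the figures in Tables 3 and 5; the dictionary and function names are illustrative.

```python
# Population by relative risk category: nationally (Table 3) and inside the zone (Table 5).
national_pop = {"High": 15_198_981, "Moderate": 63_939_131, "Low": 83_473_152}
zone_pop = {"High": 5_657_869, "Moderate": 10_439_618, "Low": 34_789_922}


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


# Table 6: share of each national risk category that lives inside the zone.
captured = {k: round(pct(zone_pop[k], national_pop[k]), 2) for k in national_pop}

# Table 5: share of the zone's population that falls in each risk category.
zone_total = sum(zone_pop.values())
within_zone = {k: round(pct(zone_pop[k], zone_total), 2) for k in zone_pop}
```

Here `captured` reproduces the 37.23, 16.33, and 41.68 in Table 6, while `within_zone` reproduces the 11.12, 20.52, and 68.37 in Table 5. Only the within-zone shares are expected to sum to 100%.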

2. Estimated proportion of infections

The second measure of risk we examine to evaluate the surveillance system is the estimated proportion of the grid cell population that was infected with V. cholerae in the year prior to data collection. This yielded the median estimated proportion of infections for each grid cell (Figure 9).

Figure 9: Median estimated proportion of grid cell population infected with V. cholerae in the previous year. The black marks indicate sentinel hospital locations.

We describe the population living in each risk category defined by the 30th and 70th percentiles of the estimated proportion of infections in each grid cell (Table 7).

Table 7: Population living in each risk category. The population living in high, moderate, and low risk areas according to the estimated proportion of infections across Bangladesh.

Risk level | Population
High | 33921270
Moderate | 64812197
Low | 63877797

We redrew the estimated proportion of infections risk map after binning grid cells into high, moderate, and low risk categories (Figure 10).

Figure 10: Cholera risk map as categorized by the estimated proportion of V. cholerae infections by 5 km x 5 km grid cell.

Risk categories by proportion infected in the cholera surveillance zone

We overlaid the cholera surveillance zone with the binned infection proportion risk maps to examine the distribution of risks in the surveilled areas (Figure 11 - Figure 12).

Figure 11: Infection proportion risk categories within cholera surveillance zones (10-20-30-30 km for subdistrict, district, tertiary care, and icddr,b hospitals).

Figure 12: Infection proportion risk categories outside of cholera surveillance zones (10-20-30-30km for subdistrict, district, tertiary care, and icddr,b hospitals).

Populations in the cholera surveillance zone

We examined the estimated number of infections, the percent infected in the cholera surveillance zone, and the percent of Bangladesh infections captured in the cholera surveillance zone (Table 8).

Table 8: Number and percentage of infections that may be captured in cholera surveillance zones. The percentage infected in the surveillance zone population is the number infected divided by the surveillance zone population. The percentage of all Bangladesh (BGD) infections is the number of infected individuals within the cholera surveillance zone divided by all infected individuals in Bangladesh.

Buffer size | Surv. zone pop | Number infected | % infected in surv. zone pop | % of all BGD infections
10-20-30-30 | 50887409 | 7047294 | 13.85 | 27.47

We then examined the distribution of categories (Table 9) based on the proportion of infections.

Table 9: Number and percentage of infections that may be captured in cholera surveillance zones, categorized by the proportion infected. The percentage of surveillance zone infections is the share of infected people in high/moderate/low risk grid cells among all infections within the cholera surveillance zone. The percentage of the surveillance zone population is the share of people living in high/moderate/low risk grid cells among all people within the cholera surveillance zone. The distribution across risk categories should sum to 100% for each set of buffer sizes.

Buffer size | Risk category | Number infected | % surv. zone infections | Surv. zone pop | % surv. zone pop
10-20-30-30 | High | 2329476 | 33.05 | 9465033 | 18.60
10-20-30-30 | Moderate | 1754864 | 24.90 | 11017629 | 21.65
10-20-30-30 | Low | 2962954 | 42.04 | 30404747 | 59.75

Populations by proportion infected across Bangladesh

We sought to describe how well the cholera surveillance zone captures high-, moderate-, and low-risk populations across Bangladesh.

We summarized the percentage of high, moderate, and low infection risk populations in Bangladesh that would be captured by cholera surveillance zones at different buffer sizes when risk was categorized by the infection proportion (Table 10).

Table 10: Number and percent infections that may be captured in Bangladesh, categorized by the proportion of individuals infected with V. cholerae. The captured at-risk population represents the percentage of high/moderate/low risk populations captured by the cholera surveillance zone out of all high/moderate/low risk populations in Bangladesh. The captured infections represents the percentage of infections in high/moderate/low risk grid cells among all infections in high/moderate/low risk grid cells across Bangladesh.

Buffer size | Risk category | Surv. zone pop | Surv. zone infections | Captured at-risk pop (%) | Captured infections (%)
10-20-30-30 | High | 9465033 | 2329476 | 27.9 | 27.65
10-20-30-30 | Moderate | 11017629 | 1754864 | 17.0 | 16.65
10-20-30-30 | Low | 30404747 | 2962954 | 47.6 | 44.29

3. Number of V. cholerae infections

The third measure of risk we examine to evaluate the surveillance system is the median estimated number of V. cholerae infections in each grid cell (Figure 13).

Figure 13: Median number of estimated V. cholerae infections per grid cell in the previous year. The black marks indicate sentinel hospital locations.

We describe the population living in each risk category defined by the 30th and 70th percentiles of the number of infections in each grid cell (Table 11).

Table 11: Population living in each risk category. The population living in high, moderate, and low risk areas according to the estimated number of infections across Bangladesh.

Risk level | Population
High | 96887104
Moderate | 56357517
Low | 9366643

We redrew the estimated number of infections risk map after binning grid cells into high, moderate, and low risk categories (Figure 14).

Figure 14: Cholera risk map as categorized by the estimated number of V. cholerae infections by 5 km x 5 km grid cell.

Risk categories by number of infections in the cholera surveillance zone

We overlaid the cholera surveillance zone with the binned number of infections risk map to examine the distribution of risk in the surveilled areas (Figure 15 - Figure 16).

Figure 15: Number of infections risk categories within cholera surveillance zones (10-20-30-30 km for subdistrict, district, tertiary care, and icddr,b hospitals).

Figure 16: Number of infections risk categories outside of cholera surveillance zones (10-20-30-30km for subdistrict, district, tertiary care, and icddr,b hospitals).

Populations in the cholera surveillance zone

We examined the estimated number of infections, the percent infected in the cholera surveillance zone, and the percent of Bangladesh infections captured in the cholera surveillance zone (Table 12).

Table 12: Number and percentage of infections that may be captured in cholera surveillance zones. The percentage infected in the surveillance zone population is the number infected divided by the surveillance zone population. The percentage of all Bangladesh (BGD) infections is the number of infected individuals within the cholera surveillance zone divided by all infected individuals in Bangladesh.

Buffer size | Surv. zone pop | Number infected | % infected in surv. zone pop | % of all BGD infections
10-20-30-30 | 50887409 | 7047294 | 13.85 | 27.47

We then examined the distribution of risk categories (Table 13) based on the number of infections.

Table 13: Number and percentage of infections that may be captured in cholera surveillance zones, categorized by the number of infections. The percentage of surveillance zone infections is the share of infected people in high/moderate/low risk grid cells among all infections within the cholera surveillance zone. The percentage of the surveillance zone population is the share of people living in high/moderate/low risk grid cells among all people within the cholera surveillance zone. The distribution across risk categories should sum to 100% for each set of buffer sizes.

Buffer size | Risk category | Number infected | % surv. zone infections | Surv. zone pop | % surv. zone pop
10-20-30-30 | High | 5259541 | 74.63 | 37576681 | 73.84
10-20-30-30 | Moderate | 1601849 | 22.73 | 11855989 | 23.30
10-20-30-30 | Low | 185903 | 2.64 | 1454739 | 2.86

Populations by number of infections across Bangladesh

We sought to describe how well the cholera surveillance zone captures high-, moderate-, and low-risk populations across Bangladesh.

We summarized the percentage of high, moderate, and low infection risk populations in Bangladesh that would be captured by cholera surveillance zones at different buffer sizes when risk was categorized by the number of V. cholerae infections (Table 14).

Table 14: Number and percent infections that may be captured in Bangladesh, categorized by the number of individuals infected with V. cholerae. The captured at-risk population represents the percentage of high/moderate/low risk populations captured by the cholera surveillance zone out of all high/moderate/low risk populations in Bangladesh. The captured infections represents the percentage of infections in high/moderate/low risk grid cells among all infections in high/moderate/low risk grid cells across Bangladesh.

Buffer size | Risk category | Surv. zone pop | Surv. zone infections | Captured at-risk pop (%) | Captured infections (%)
10-20-30-30 | High | 37576681 | 5259541 | 38.78 | 32.82
10-20-30-30 | Moderate | 11855989 | 1601849 | 21.04 | 19.33
10-20-30-30 | Low | 1454739 | 185903 | 15.53 | 13.89

Discussion